MICROARRAY METHOD OF GENOTYPING MULTIPLE SAMPLES AT MULTIPLE LOCI

Mark A. Schena 

TECHNICAL FIELD 

The present invention relates generally to genotyping and more particularly to genotyping 
for disease diagnostics. 

BACKGROUND 

A large number of pathological conditions in humans, animals, and plants are now 
understood at the genetic level. With the announced completion of the mapping of the human 
genome, it is expected that the genetic basis of many more human diseases will be identified in 
the coming years. Analysis of DNA from an individual, therefore, can, in principle, allow 
genetically based conditions to be diagnosed or to be identified in the absence of overt 
symptoms. This is advantageous for many conditions such as metabolic disorders in which early 
diagnosis can prevent serious medical complications later in life. 

Methods of analyzing DNA sequences, which are often referred to generically as
genotyping, are known in the art. In very general terms, to determine whether the DNA in a
sample corresponds to a particular disease condition whose genetic sequence is known, the
sample is exposed to nucleic acid probes associated with that disease, under conditions that allow
hybridization. The nucleic acid probes are labeled, making it possible to detect whether the
probes have hybridized to the DNA sample. In one technique, the probes are arranged in arrays 
on chips, with each probe assigned to a specific location. After exposing the array to a labeled 
DNA sample, scanning devices can examine each location in the array and determine whether a 
target molecule has interacted with the probe at that location. Array chips are provided 
commercially, for example, by Affymetrix (Santa Clara, CA) and are described in patents 
assigned to Affymetrix (see, for example, U.S. Patent Nos. 6,045,996, 5,858,659, and
5,925,525, and references therein). Arrays have also been used for DNA sequencing applications such
as the Sequencing by Hybridization approaches described, for example, in U.S. Patent Nos.
6,025,136, 6,018,041, 5,525,464, and 5,202,231.

While methods of genotyping for disease diagnostics are available, in order for the
methods to be useful in a public health setting, they need to be reasonable in cost. For example,
although relevant genetic assays are known, neonatal screening is currently done by mass
spectrometric methods primarily because of cost considerations. Secondly, DNA diagnostics in
a public health setting need to be practical for application to multiple samples and to genetic
conditions in which mass spectrometric methods are difficult or intractable. The requirement of
multiple samples may be addressed by using multiple array chips, which are processed
simultaneously. As described in U.S. Patent No. 5,545,531 to Rava et al., a format including a
standard 96-well microtiter plate containing an array chip at the bottom of each well can be used.
To perform the same test on many patient samples, each patient sample, in solution, is labeled
and introduced into a different well, each of which has an identical array chip. Thus, in this
method, a separate array chip is used for each sample, which may be costly for widespread use
because of the fixed per-patient costs of arrays, reagents, sample processing, and so forth.

U.S. Patent No. 5,807,522 to Brown et al. describes a method of screening multiple
patients against known mutations in a disease gene using multiple microarrays of patient
genomic DNA and probe DNA fragments representing all known mutations of a given gene.
The microarrays are fabricated on a sheet of plastic-backed nitrocellulose with silicone rubber
barrier elements between individual arrays to prevent cross contamination. All microarrays are
processed as a single sheet of material. However, the method of Brown et al. uses a separate
microarray for each mutated allele or genetic marker screened.

Thus, there is a need for a method of genotyping with sufficient precision for diagnostic 
use, that is affordable and that provides sufficient throughput for large-scale use. Ideally, such a 
method would allow multiple patients to be screened for multiple diseases in a single assay. 
More generally, the method would allow multiple samples from any source of human, animal,
plant, or microbial material to be screened for alleles at multiple genetic loci in a single assay.

SUMMARY 

The present invention provides a method for genotyping multiple samples at multiple
genetic loci in a single assay. According to the method, genomic segments from multiple
samples are amplified using polymerase chain reaction primers, where each genomic segment
contains a genetic locus, that is, a DNA marker of interest. The genomic segments are formed
into a microarray on a surface where the material at each location of the surface corresponds
essentially to a single genomic segment from a single sample. The microarray is hybridized with
a mixture of synthetic oligonucleotides that are complementary to the genomic segments on the
microarray. Genotyping information for the multiple samples is then derived simultaneously by
reading the microarray signals. The method can be used for disease diagnostics or to screen for
alleles from any plant or animal species and thus can be used for a broad variety of applications.

BRIEF DESCRIPTION OF THE DRAWINGS 

The file of this patent contains at least one drawing executed in color. Copies of this 
patent with color drawing(s) will be provided by the Patent and Trademark Office upon request 
and payment of the necessary fee. 

Fig. 1 is a flow chart of a method of genotyping multiple samples at multiple genetic loci
simultaneously according to an embodiment of the present invention.

Fig. 2 is a schematic representation of a small portion of a microarray according to an
embodiment of the present invention.

Figs. 3A and 3B show direct fluorescence signals detected from Cy3 and Cy5 emission,
respectively, from a microarray of 576 features prepared from 72 different patient samples,
according to an embodiment of the present invention, as described in the Example below. The
microarray signals were read with a confocal scanner at 100% photomultiplier tube (PMT) and
80% laser settings. A conventional rainbow code is used with red being the most intense and
black being the least intense.

Figs. 4A and 4B are magnified portions of the data in Figs. 3A and 3B, respectively.
Letters (a-c) and numbers (1-28) demarcate the location of each of the different patient samples
as follows: a10-12, sample 1 (S/S); a13-15, sample 2 (A/S); a16-18, sample 3 (S/C); a19-21,
sample 4 (C/C); a22-24, sample 5 (A/C); a25-27, sample 6 (A/A); b1-3, sample 7 (E/E); b4-6,
sample 8 (A/E); b7-9, sample 9 (A/A); b10-12, sample 10 (wild type); b13-15, sample 11
(heterozygous); b16-18, sample 12 (homozygous); b19-21, sample 13 (wild type); b22-24,
sample 14 (heterozygous); b25-27, sample 15 (homozygous). Background subtraction was
performed using the signal from the negative control printing buffer (a28-30). Positive
hybridization controls (b28-30, c1-3, c4-6, c7-9 and c10-12) are also shown. The space bar
corresponds to 1.0 mm.

Fig. 5 displays quantitative values of the signals represented by the rainbow code in Fig.
4A. Triplicate measurements were averaged after background subtraction. The genotypes of 
each patient sample are given above and below the graph. 

Figs. 6A and 6B show signals detected from Cy3 and Cy5 emission, respectively, from a 
microarray prepared with indirect labeling methods from the same set of patient samples as in 
Figs. 3A and 3B, according to an embodiment of the present invention, as described in the 
Example below. The signals were detected as described for Figs. 3A and 3B. 

DETAILED DESCRIPTION 

A method of genotyping obtains information about multiple individuals at multiple 
disease loci simultaneously. As used herein, genotyping is specifically defined as distinguishing 
alleles at a given genetic locus at single nucleotide resolution. A genetic locus (plural loci) is 
defined as a chromosomal location of a genetic or DNA marker. Thus the methods according to 
the present invention have the precision required to provide screening and diagnostic information 
for individuals that can be used as the basis for medical decisions. 

An overview of the method is illustrated diagrammatically in Fig. 1. First, at step 10, 
samples of genomic DNA are isolated from biological specimens. The specimens can be of any 
origin including bacteria, yeast, plants or animals. Any organism that contains DNA is amenable 
to the method. For application to human disease diagnostics, biological specimens are obtained, 
for example, from blood, amniotic fluids, neonatal blood cards, saliva, semen, epithelial scrapes, 
and needle biopsies. For certain organisms, the biological specimens may contain RNA but not 
DNA. The samples are isolated and purified using standard procedures. 

Sample DNA is then amplified with gene-specific primers by use of the polymerase chain
reaction (PCR) at step 12 to produce the so-called amplicons. The PCR process is broadly used
and has been described extensively in the art (see, for example, U.S. Patent Nos. 4,683,202,
4,683,195, 4,800,159, 4,965,188 and 5,333,675 and references therein). A specific pair of
primers is used for each genomic segment of interest, that is, for each genomic segment
containing a known potential mutation or other DNA alteration of interest. The method is
applicable, therefore, to any disease identified with DNA markers. Such diseases include, for
example, Cystic Fibrosis, Tyrosinemia, Maple Syrup Urine Disease, α-1-Antitrypsin Deficiency,
Glutaric Aciduria Type I, Hereditary Hearing Loss, Beta-Thalassemia, Long Chain 3-Hydroxyl
Acyl CoA Dehydrogenase Deficiency, and Medium Chain Acyl CoA Dehydrogenase Deficiency, to
name a very few. The method is particularly useful for diseases such as Sickle Cell Anemia and
Galactosemia wherein a well-studied single mutation in a single gene can produce the disease
phenotype. The DNA markers could represent any of the known types of DNA alterations
including mutations, single nucleotide polymorphisms, small deletions and the like. The only
requirement is that the DNA alteration of interest must reside within the primer pairs used to
generate the amplicons. This ensures that amplicons generated from all of the specimens at a
given locus, including specimens from homozygotes, heterozygotes and normal individuals,
amplify with nearly equal efficiency.

Each biological sample is treated separately with multiple primer pairs to produce
multiple amplicons for each individual, each amplicon associated with a specific genomic
segment from a specific individual, each genomic segment containing a genetic locus of interest.
The length of each amplicon, as the method is currently practiced, is about 60 base pairs,
although the method may be applied with amplicons in the range of between about 40 and 1000
base pairs. The total volume of each PCR reaction as is typically practiced currently is about 50
µl. It is anticipated that further optimization will reduce the minimum volume to 5-10 µl, which
will allow the method to provide additional cost savings by minimizing the amount of the PCR
amplification and purification reagents used for sample preparation.

The genomic segments are purified to remove contaminants such as nucleotides, enzyme,
primers and other substances that may interfere with microarray printing, attachment or
hybridization. Methods for PCR amplicon purification are available from a host of commercial
vendors including TeleChem (Sunnyvale, CA) and Qiagen (Valencia, CA). The purified
amplicons are suspended in buffers such as solutions of sodium chloride and sodium citrate
(SSC), solutions of dimethyl sulfoxide (DMSO), solutions of sodium chloride, sodium phosphate
and ethylene diamine tetraacetate (SSPE) or other standard reagents. The buffered amplicons are
arrayed in standard 96-well or 384-well microplates, in step 14, one amplicon solution per well.
Typical volumes of purified product are about 3-4 µl per well for the arraying step, although
product volumes in the range of 2-20 µl are sufficient for forming the microarrays.

The DNA isolation and PCR processes are readily scalable in either 96-well or 384-well
configurations such that >10,000 samples per day are readily achieved in an automated
laboratory setting. This throughput would allow amplification of 10 loci from 240,000 patients
annually. The method thus enables broad screening of the population as well as other
high-throughput applications such as those required for crop breeding in agriculture, forensics,
military applications, and the like.
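The annual figure above can be reproduced with simple arithmetic. The sketch below assumes that each "sample" in the daily rate is one PCR reaction and that the laboratory runs about 240 days per year; neither reading is stated in the text, so both are labeled as assumptions.

```python
# Back-of-the-envelope check of the throughput figures quoted in the text.
REACTIONS_PER_DAY = 10_000     # ">10,000 samples per day" (read here as PCR reactions)
LOCI_PER_PATIENT = 10          # loci amplified per patient, from the text
WORKING_DAYS_PER_YEAR = 240    # assumption: not stated in the text


def patients_per_year(reactions_per_day, loci_per_patient, working_days):
    """Number of patients whose loci can all be amplified in one year."""
    return reactions_per_day * working_days // loci_per_patient


annual_patients = patients_per_year(
    REACTIONS_PER_DAY, LOCI_PER_PATIENT, WORKING_DAYS_PER_YEAR
)
print(annual_patients)
```

Under these assumptions the result matches the 240,000 patients per year quoted above.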

In step 16, a high-density microarray of the genomic segments in the microplate is
formed on a substrate, with the current size of the microarray occupying about 1.0 cm² on the
substrate that is typically the size of a standard 25 mm x 76 mm microscope slide. A typical spot
diameter is about 100 µm placed at a center-to-center spacing of about 140 µm, to allow each
spot to form at a distinct and separate location on the substrate. The total number of spots in the
experiments described in the Example below is 576 spots in the 1.0 cm² printed area, though
scale-up of samples coupled with the current capability of the micro-spotting technology would
allow over 1,000 spots per cm², that is, an estimated 5,184 spots per cm², such that an 18 mm x 72
mm microarray would contain approximately 82,944 spots per 25 mm x 76 mm
microarray slide. Thus, the methods of the present invention allow genotyping information to be
obtained from multiple individuals simultaneously, that is, from at least 10, at least 60, or at least
5,000 individuals simultaneously. In principle, microarrays of every citizen could provide a
permanent gene archive of every person in the population.

Each spot in the microarray corresponds essentially to a single amplicon from a single
individual, within the precision of PCR processes. The samples are currently printed in triplicate
at 140 µm spacing. Triplicate spotting increases the reliability of results, and is shown
schematically, for example, in Fig. 2. A first row 26 contains, from left to right, three spots
corresponding to the amplicon from individual number 1, treated with PCR primer pair A,
denoted 1A, followed by three spots from individual 1, treated with PCR primer pair B, and so
on. In Fig. 2, Row 28 shows the amplicons from individual 2, treated with PCR primer pairs A,
B, and C. There is no requirement, however, that the spots from different samples be placed in
different rows, though the spots from different amplicon solutions do need to be placed at
distinct locations. There is also no requirement for a triplicate spotting configuration per se and
single, double or quadruple or other patterns could be used to generate reliable genotyping
information.
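The row layout described for Fig. 2 can be sketched as a small data structure. This is only an illustration: the function name and the choice of three primer pairs for both rows are assumptions, and as noted above the method does not require one row per individual or triplicate spotting.

```python
def triplicate_row(individual, primer_pairs, replicates=3):
    """Return spot labels for one row, left to right (e.g. 1A 1A 1A 1B ...)."""
    return [
        f"{individual}{pair}"
        for pair in primer_pairs
        for _ in range(replicates)
    ]


# Two rows in the style of Fig. 2: individuals 1 and 2, primer pairs A, B and C.
layout = [triplicate_row(individual, "ABC") for individual in (1, 2)]

for row in layout:
    print(" ".join(row))
```

Changing `replicates` gives the single, double or quadruple patterns mentioned in the text.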

Currently available technologies for forming microarrays include both contact and
non-contact printing technologies. One example is the PixSys 5500 motion control system from
Cartesian Technologies (Irvine, CA) fitted with the Stealth Micro-spotting printhead from
TeleChem (Sunnyvale, CA). Contact printing technologies include mechanical devices using
solid pins, split pins, tweezers, micro-spotting pins and pin and ring. Contact printing
technologies are available commercially from a number of vendors including BioRobotics
(Boston, MA), Genetix (Christchurch, United Kingdom), Incyte (Palo Alto, CA), Genetic
Microsystems (Santa Clara, CA), Affymetrix (Santa Clara, CA), Synteni (Fremont, CA),
Cartesian Technologies (Irvine, CA) and others. The non-contact printing technologies include
"ink-jetting" type devices such as those that employ piezoelectrics, bubble-jets, micro-solenoid
valves, syringe pumps and the like. Commercial vendors of non-contact printing technologies
include Packard Instruments (Meriden, CT), Agilent (Palo Alto, CA), Rosetta (Kirkland, WA),
Cartesian Technologies (Irvine, CA), Protogene (Palo Alto, CA) and others. Both contact and
non-contact devices can be used on either homemade or commercial devices capable of
three-dimensional movement. Motion control devices from Engineering Services Incorporated
(Toronto, Canada), Intelligent Automation Systems (Cambridge, MA), GeneMachines (San
Carlos, CA), Cartesian Technologies (Irvine, CA), Genetix (Christchurch, United Kingdom), and
others would also be suitable for manufacturing microarrays according to the present invention.

The primer pairs used in the PCR reaction in the present method typically contain
reactive groups, such as alkylamine groups, that allow specific attachment of the amplicons to
microarray substrates, for example, glass substrates, which may be chemically treated. For
example, the substrates may contain reactive aldehyde groups that allow end-attachment of
amino-linked PCR products via a Schiff's base, produced as a reaction product. The attachment
reaction proceeds by a dehydration reaction. Hydrophobic printing surfaces such as those that
contain reactive aldehyde groups are useful in preventing sample spreading and therefore
enabling smaller spot sizes and higher microarray densities. Microarray substrates with reactive
aldehyde groups are available from a number of vendors including TeleChem (Sunnyvale, CA)
and GEL Associates (Houston, TX). It will be apparent, however, that any of a number of
additional microarray surfaces and attachment chemistries could also be employed including
those that contain coatings or treatments of poly-lysine, organosilane, epoxysilane, reactive
carboxyl groups, gel pad materials, nitrocellulose-coated glass and other substances. It will also
be apparent that in addition to end-attachment schemes, a number of non-specific schemes
including cross-linking to the substrate with ultraviolet light or heat, electrostatic interactions,
hydrophobic interactions and other means may alternatively be used.

At step 18, the microarrays are processed and hybridized with mixtures of labeled
synthetic oligonucleotides. The microarrays are processed to remove unbound DNA material,
inactivate unreacted aldehyde groups and denature the printed PCR segments prior to microarray
hybridization, using conventional protocols (see, for example, Schena et al., PNAS 93,
10614-10619, 1996). In general, hybridization reactions are carried out in aqueous solutions containing
salts and detergent at a temperature about 10 °C below the melting temperature, Tm, of the
synthetic oligonucleotides. The hybridization mixtures consist of synthetic oligonucleotides
complementary to alleles present in the amplicons on the microarray. That is, each synthetic
oligonucleotide in the mixture corresponds to a genetic locus selected by a PCR primer pair.

According to the method, a virtually unlimited number of different hybridization mixtures could
be prepared to detect alleles in amplicons of interest from any nucleic acid-containing organism.
It will also be apparent that the process is scalable such that mixtures containing dozens,
hundreds, or possibly thousands of different oligonucleotides could be used to examine many
different alleles of interest and hence many different diseases simultaneously. The only
requirement is having sequence information available for the wild type and altered alleles as well
as the bordering gene sequences that are complementary to each PCR primer pair. Synthetic
oligonucleotides are widely available from a number of vendors including EOS Biotechnology
(South San Francisco, CA) and Operon Technologies (Alameda, CA).
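The rule of hybridizing about 10 °C below Tm can be illustrated with a rough estimate. The text does not say how Tm is obtained; the sketch below assumes the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only an approximation for short oligonucleotides, and the probe sequence is hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature in deg C using the Wallace rule.

    Assumption: the source does not specify a Tm formula; this rule is a
    common rough estimate for short oligonucleotides.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def hybridization_temp(oligo, offset=10):
    """Hybridization temperature about `offset` deg C below the estimated Tm."""
    return wallace_tm(oligo) - offset


probe = "ACCATGGTGCACCTG"  # hypothetical 15-mer, for illustration only
print(wallace_tm(probe), hybridization_temp(probe))
```

In practice salt and detergent concentrations shift Tm, so a value computed this way would be a starting point for optimization rather than a fixed setting.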

The oligonucleotides in the mixture are typically about 10 to 30 nucleotides in length to
maximize the capacity to distinguish single nucleotide variations within the amplicons. For
example, the oligonucleotides may be 15 nucleotides in length (15-mers) where the allele of
interest is located at the central position (position 8) relative to the 15-mer. The synthetic
oligonucleotides in the mixture are labeled and labels may reside, for example, at the 5' end of
each oligonucleotide, though labels at either the 5' or the 3' ends or possibly both would be
expected to work within the described method. Both direct and indirect labeling methods are
known in the art. Common fluorescent tags used in direct labeling include the dyes denoted Cy3
and Cy5 which fluoresce at approximately 550 nm and 650 nm, respectively. The
oligonucleotide mixture can contain multiple fluorescent tags that fluoresce at multiple
wavelengths. Any number of different types of fluorescent tags could be used in place of the
Cy3 and Cy5 tags to allow detection of one or more different colors. Multi-color approaches
would be expected to be useful by allowing, for example, the wild type allele to be detected with
maximum efficiency in one color and the mutant allele to be detected with maximum efficiency
in another color. Other types of labels would include a variety of commercial dyes and dye
derivatives such as those that are denoted Alexa, Fluorescein, Rhodamine, FAM, TAMRA, Joe,
ROX, Texas Red, BODIPY, FITC, Oregon Green, Lissamine and others. Many of these dyes
and derivatives can be obtained from commercial providers such as Molecular Probes (Eugene,
OR), Amersham Pharmacia (Bucks, United Kingdom) and Glen Research (Sterling, VA).
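The 15-mer design with the allele at position 8 can be sketched as a sequence window. The reference sequence, function names and alleles below are hypothetical; the sketch returns same-strand sequence, so whether a probe must be reverse-complemented depends on which amplicon strand is attached to the substrate, which is not modeled here.

```python
def centered_probe(sequence, variant_index, length=15):
    """Return a window of `length` bases with the variant at the central position.

    `variant_index` is 0-based. For a 15-mer the variant ends up at
    position 8 when counting from 1.
    """
    if length % 2 == 0:
        raise ValueError("length must be odd so that a central position exists")
    half = length // 2
    start, end = variant_index - half, variant_index + half + 1
    if start < 0 or end > len(sequence):
        raise ValueError("variant is too close to the end of the sequence")
    return sequence[start:end]


def allele_probes(sequence, variant_index, alleles, length=15):
    """Build one probe per allele, differing only at the central base."""
    probes = {}
    for allele in alleles:
        variant_seq = sequence[:variant_index] + allele + sequence[variant_index + 1:]
        probes[allele] = centered_probe(variant_seq, variant_index, length)
    return probes


reference = "ACGTACGTACGTACGTACGTA"  # hypothetical 21-base segment
probes = allele_probes(reference, variant_index=10, alleles="GA")
for allele, probe in probes.items():
    print(allele, probe)
```

Placing the variable base in the middle leaves seven matched bases on each side, which is the arrangement the text describes for the 15-mers.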

Indirect labeling methods include, for example, labeling with biotin or dinitrophenol
which are organic molecules that are not themselves fluorescent, but are reactive with antibody
conjugates that contain fluorescent groups attached to them. Labels, haptens or epitopes such as
biotin and dinitrophenol therefore allow fluorescent detection by so-called indirect means
because the fluorescence at each spot is contributed by the antibody conjugate which interacts
with the microarray via interactions with the non-fluorescent label. Certain antibody conjugates
contain enzymes such as horseradish peroxidase which catalyze the attachment of short-lived
Tyramide free radicals to the tyrosine moieties of proteins attached to the microarray surface.
By linking Tyramide to various fluorescent moieties, it is possible to detect hybridized products
by indirect means involving biotin and dinitrophenol labels, antibody-horseradish peroxidase
conjugates and Tyramide-Cy3 and Tyramide-Cy5 derivatives. Anyone skilled in the art will
appreciate, however, that any number of direct and indirect labeling schemes could be used for
detection including both fluorescent and non-fluorescent approaches. One alternative fluorescent
approach would use the Dendrimer technology described by Genisphere (Oakland, NJ). One
alternative non-fluorescent approach would use beads and particles such as described with
Resonance Light Scattering (RLS) particles by Genicon (San Diego, CA).

Following hybridization with the labeled synthetic oligonucleotide mixture, the
microarrays are scanned or read by known methods, in step 20, to detect genotyping information.
Detection can be performed, for example, using a confocal scanning instrument with laser
excitation and photomultiplier tube detection, such as the ScanArray 3000 provided by GSI
Lumonics (Billerica, MA). Alternatively, many different types of confocal and non-confocal
fluorescent detection systems could be used to implement the method such as those provided by
Axon Instruments (Foster City, CA), Genetic MicroSystems (Santa Clara, CA), Molecular
Dynamics (Sunnyvale, CA) and Virtek (Woburn, MA). Alternative detection systems include
scanning systems that use gas, diode and solid state lasers as well as those that use a variety of
other types of illumination sources such as xenon and halogen bulbs. In addition to
photomultiplier tubes, detectors could include cameras that use charge coupled device (CCD)
and complementary metal oxide semiconductor (CMOS) chips.

Whether directly labeled or indirectly labeled oligonucleotides are used for hybridization,
the strength of the signal detected from a given microarray spot is directly proportional to the
degree of hybridization of an oligonucleotide in the mixture to the genomic segment at a given
spot. The oligonucleotide mixture can contain nucleotides complementary to either the wild type
or mutant alleles so either wild type or mutant genomic segments can be detected depending on
how the hybridizing mixture was prepared. Signals from the identical microarray spots, for
example, from the three spots labeled 1A in Fig. 2, are averaged for increased precision and
therefore to provide small coefficients of variation (CVs).
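The averaging of replicate spots can be sketched as follows. The signal values and the background level are made-up numbers, and the use of the sample standard deviation for the CV is an assumption, since the text does not specify how the CV is computed.

```python
from statistics import mean, stdev


def summarize_replicates(signals, background):
    """Average background-subtracted replicate signals and report their CV.

    Returns (mean_signal, coefficient_of_variation). The CV is the sample
    standard deviation divided by the mean (an assumed convention).
    """
    corrected = [s - background for s in signals]
    avg = mean(corrected)
    cv = stdev(corrected) / avg if avg else float("nan")
    return avg, cv


# Hypothetical raw intensities for the three spots labeled 1A.
avg, cv = summarize_replicates([1200.0, 1100.0, 1300.0], background=100.0)
print(f"mean={avg:.1f} CV={cv:.3f}")
```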

A variety of means may be used to obtain and evaluate genotyping information. As
described above, absolute fluorescent signals can be used to determine the allelic composition of
a given amplicon. Alternatively, one could also use oligonucleotide mixtures with two or more
colors, with a given color dedicated to a given allele, such as the wild type allele as a green fluor and the
mutant allele as the red fluor. A variety of additional schemes could also be used in conjunction
with direct labeling such as fluorescent stains to assess the DNA content of each spot. The
SYBR Green dyes available from Molecular Probes (Eugene, OR) allow detection of stained
DNA in the wavelength range of the fluorescein isothiocyanate (FITC) dyes.
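One way to picture the two-color scheme is a call based on the fraction of total signal in the mutant channel. This is a hypothetical sketch, not a procedure given in the text: the thresholds, the minimum-signal cutoff and the function name are all assumptions that would need calibration against samples of known genotype.

```python
def call_genotype(wild_type_signal, mutant_signal,
                  min_signal=500.0, het_band=(0.25, 0.75)):
    """Classify a spot from background-subtracted two-color signals.

    All thresholds are illustrative assumptions, not values from the source.
    """
    total = wild_type_signal + mutant_signal
    if total < min_signal:
        return "no call"  # too little hybridization in either channel
    mutant_fraction = mutant_signal / total
    if mutant_fraction < het_band[0]:
        return "wild type"
    if mutant_fraction > het_band[1]:
        return "homozygous mutant"
    return "heterozygous"


print(call_genotype(5000.0, 200.0))
print(call_genotype(2500.0, 2600.0))
print(call_genotype(150.0, 4800.0))
```

A stain-based measure of DNA content per spot, as mentioned above, could be used to normalize the two signals before such a call is made.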

The features and benefits of the present invention are further illustrated, but not limited,
by the following example in which neonatal blood samples were screened for various alleles of
Sickle Cell Anemia and Galactosemia.

EXAMPLE

Neonatal blood samples from 72 different newborns were isolated and amplified with
gene-specific primers denoted ARDC-100 through ARDC-109 in Table 1 below. These five primer pairs contain
reactive amine groups corresponding to the C6 amino modification from Glen Research
(Sterling, VA), that allow specific attachment of the amplicons to the microarray substrate. The "N"
position in each oligonucleotide sequence in Table 1 below denotes the C6 amino modification.
The primer pairs encompass five discrete genomic segments corresponding to a total of three
human genes: β-globin, CFTR and GALT. The diseases associated with the β-globin, CFTR
and GALT genes in humans are Sickle Cell Anemia, Cystic Fibrosis and Galactosemia,
respectively. The genomic segments encompassed five disease loci in the three genes and the
approximate size of each amplicon was 60 base pairs. The total volume of each PCR reaction
was 50 µl.

The genomic segments were amplified and then purified to remove contaminants. A
384-well PCR purification kit by TeleChem (Sunnyvale, CA) was used according to the
instructions of the manufacturer. The purified products were re-suspended in 10 µl of sterile,
distilled water and 2 µl of the 10 µl was mixed with 2 µl of 2X Micro-Spotting Solution,
provided by TeleChem (Sunnyvale, CA), to provide a total of 4 µl of sample for printing. The
concentration of each PCR amplicon in the sample plate was 100 µg/µl. The 72 samples
of 4 µl each were placed in adjacent wells of the 384-well plate, along with a total of 24 control
samples containing either printing buffer alone or synthetic oligonucleotides. The 24 control
samples provided both positive and negative hybridization controls in the experiments. A total
of 96 samples (72 neonatal amplicons and 24 controls) were placed in a 384-well microplate such
that all the wells in the first four rows (A1-24 through D1-24) each contained 4 µl of sample.
Polypropylene 384-well microplates from Corning Costar (Corning, NY) were used, although
plates from other vendors such as Whatman Polyfiltronics (Rockland, MA) could alternatively
be used. The hydrophobic material produces convex sample droplets that tend to have slightly
improved loading and printing efficiency as compared to samples contained in microplates of
materials such as polystyrene, though many different types of microplates suffice for sample
holding.



Table 1. PCR primers used to amplify genomic segments

Primer I.D.   Description                       Sequence
ARDC-100      Sickle Cell C allele 5'           5' NAAACAGACACCATGGTGCAC 3'
ARDC-101      Sickle Cell C allele 3'           5' NCCCACAGGGCAGTAACGGCA 3'
ARDC-102      Sickle Cell E allele 5'           5' NGCAAGGTGAACGTGGATGAA 3'
ARDC-103      Sickle Cell E allele 3'           5' NGTAACCTTGATACCAACCTG 3'
ARDC-104      Cystic Fibrosis ΔF508 allele 5'   5' NCTGGCACCATTAAAGAAAAT 3'
ARDC-105      Cystic Fibrosis ΔF508 allele 3'   5' NTTCTGTATCTATATTCATCA 3'
ARDC-106      GALT Q188R 5'                     5' NTGGGCTGTTCTAACCCCCAC 3'
ARDC-107      GALT Q188R 3'                     5' NAACCCACTGGAGCCCCTGAC 3'
ARDC-108      GALT N314D 5'                     5' NCCACAGGATCAGAGGCTGGG 3'
ARDC-109      GALT N314D 3'                     5' NGGTAGTAATGAGCGTGCAGC 3'



The 72 neonatal samples plus 24 control samples were formed into a
microarray using a PixSys 5500 motion control system from Cartesian Technologies (Irvine,
CA) fitted with the Stealth Micro-Spotting Technology from TeleChem (Sunnyvale, CA). The
Stealth printhead contained a total of 4 printing pins arranged in a 2 x 2 configuration at 4.5 mm
center-to-center spacing. The set of 4 pins was used to load and print 4 samples at a time from
the 384-well microplate. A total of 24 printing cycles (96 samples divided by 4 pins) was used
to print the 72 neonatal samples and the 24 controls. The total print time was approximately 48
minutes.

All 96 samples were printed in triplicate (288 total spots) as 100 µm spots at 140 µm spot
spacing such that each of the 4 pins produced a microarray subgrid containing 72 individual
microarray spots (288 total spots divided by 4 pins). All 96 samples were then re-printed in
triplicate at a 2 millimeter offset relative to the first microarrays to provide a duplicate set of
spots for all 96 samples (288 additional spots). The final microarrays each contained a total of
576 microarray spots (288 plus 288) in a total area of about 1.0 cm². A total of 30 microarrays
were printed on 30 SuperAldehyde Microarray Substrates from TeleChem (Sunnyvale, CA)
according to the instructions of the manufacturer, to allow for a variety of different hybridization
mixtures and optimizations to be performed. Although 30 microarray substrates were printed in
this example, it may be noted that several of the commercial printing systems, including the
technology from ESI (Toronto, Canada), allow up to 120 substrates to be printed in a single run.
A single microarray is sufficient to yield the genotyping information with a single hybridization
mixture, and multiple microarrays allow a given set of samples to be analyzed with different
hybridization mixtures.
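The spot counts in the printing description can be checked with a few lines of arithmetic. All inputs below are taken from the Example; only the variable names are introduced here for illustration.

```python
# Inputs stated in the Example.
NEONATAL_SAMPLES = 72
CONTROL_SAMPLES = 24
PINS = 4            # 2 x 2 printhead
REPLICATES = 3      # triplicate spotting
PRINT_PASSES = 2    # original print plus the re-print at a 2 mm offset

samples = NEONATAL_SAMPLES + CONTROL_SAMPLES      # wells loaded in the plate
printing_cycles = samples // PINS                 # load-and-print cycles per pass
spots_per_pass = samples * REPLICATES             # spots laid down in one pass
spots_per_subgrid = spots_per_pass // PINS        # spots produced by each pin
total_spots = spots_per_pass * PRINT_PASSES       # spots on the finished microarray

print(samples, printing_cycles, spots_per_pass, spots_per_subgrid, total_spots)
```

The values reproduce the 96 samples, 24 printing cycles, 288 spots per pass, 72 spots per subgrid and 576 total spots given in the text.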

Following the printing step, the microarrays were allowed to dry overnight at room
temperature on the platen of the microarraying device and then processed to remove unbound
DNA material, inactivate unreacted aldehyde groups and denature the printed PCR segments
prior to microarray hybridization. The processing steps were as follows: soak twice in 0.2%
SDS for 2 minutes at room temperature with vigorous agitation, soak twice in distilled H2O for 2
minutes at room temperature with vigorous agitation, treat substrates for 2 minutes in distilled
H2O at 100°C to allow DNA denaturation, allow substrates to air dry for 5 minutes at room
temperature, treat substrates for 5 minutes in sodium borohydride solution (prepared by
dissolving 1.2 g NaBH4 in 330 ml phosphate buffered saline (PBS) and adding 120 ml 100%
ethanol to reduce bubbling), rinse substrates three times in 0.2% SDS for 1 minute each at room
temperature, rinse substrates once in distilled H2O for 1 minute at room temperature, submerge
slides in distilled H2O at 100°C for 5 seconds, allow the slides to air dry and store in the dark at
room temperature. It should be noted that because the sodium borohydride solution is a
highly reactive reducing agent, it is prepared fresh just prior to use to ensure that the unreacted
aldehyde groups on the surface are reduced with high efficiency.
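
For reference, the composition of the sodium borohydride solution can be expressed as concentrations. This is a rough sketch only: it assumes the PBS and ethanol volumes are additive (ethanol and water mixtures shrink slightly on mixing) and uses a molar mass of 37.83 g/mol for NaBH4.

```python
# Approximate composition of the sodium borohydride solution described above.
# ASSUMPTION: the two liquid volumes are treated as additive.

NABH4_G = 1.2
PBS_ML = 330.0
ETHANOL_ML = 120.0
NABH4_MOLAR_MASS = 37.83  # g/mol (Na 22.99 + B 10.81 + 4 x H 1.008)

total_ml = PBS_ML + ETHANOL_ML
percent_w_v = NABH4_G / total_ml * 100.0                       # g per 100 ml
molarity_mM = NABH4_G / NABH4_MOLAR_MASS / total_ml * 1.0e6    # mol/L -> mM
ethanol_percent_v_v = ETHANOL_ML / total_ml * 100.0

print(f"{total_ml:.0f} ml total, {percent_w_v:.2f}% w/v NaBH4 "
      f"(about {molarity_mM:.0f} mM), {ethanol_percent_v_v:.0f}% v/v ethanol")
```
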

Hybridization mixtures were prepared using synthetic oligonucleotides manufactured by
the commercial provider EOS Biotechnology (South San Francisco, CA). Each synthetic
oligonucleotide was complementary to an allele present in a specific amplicon. The alleles for
the neonatal examples corresponded to disease loci of interest. To demonstrate direct detection,
a mixture of 15-mers containing Cy3 or Cy5 labels, denoted as Mixture 1 in Table 2 below, was
used. The Cy3 and Cy5 labels in Mixture 1 of Table 2 are denoted by the numbers "3" and "5"
respectively in each oligonucleotide sequence. To demonstrate indirect detection, a mixture of
15-mers containing biotin or dinitrophenol labels, denoted as Mixture 2 in Table 2, was used.
The biotin and dinitrophenol labels in Mixture 2 of Table 2 are denoted by the letters "B" and
"D" respectively in each oligonucleotide sequence. The synthesis scale was 10 nmoles for all the
oligonucleotides listed in Table 2 and each oligonucleotide was suspended in distilled H2O at a
concentration of 100 µM just prior to use. Mixture 1 was prepared by making a 50 µl solution
containing a 2 µM concentration of each of the ten oligonucleotides (Table 2, ARDC-110 to
ARDC-119) in a buffer of 5X SSC (0.75 M sodium chloride, 0.075 M sodium citrate) and 0.2% SDS
(sodium dodecyl sulfate). Mixture 2 was prepared in the same manner as Mixture 1 except that the
ten oligonucleotides were ARDC-125 to ARDC-129 and ARDC-135 to ARDC-139 (Table 2).
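
The pipetting volumes implied by these concentrations follow from the dilution relation C1 x V1 = C2 x V2. The sketch below works them out for one 50 µl mixture. The 20X SSC and 10% SDS stock concentrations are assumptions chosen for illustration; the text specifies only the final buffer composition.

```python
# Dilution arithmetic for one 50 ul hybridization mixture.
# ASSUMPTION: buffer is added from 20X SSC and 10% SDS stocks (not stated in the text).

FINAL_VOLUME_UL = 50.0
N_OLIGOS = 10


def stock_volume_ul(stock_conc: float, final_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 (concentrations in the same unit)."""
    return final_conc * final_volume_ul / stock_conc


per_oligo_ul = stock_volume_ul(100.0, 2.0, FINAL_VOLUME_UL)   # 100 uM stock -> 2 uM
oligo_total_ul = per_oligo_ul * N_OLIGOS
ssc_ul = stock_volume_ul(20.0, 5.0, FINAL_VOLUME_UL)          # hypothetical 20X stock -> 5X
sds_ul = stock_volume_ul(10.0, 0.2, FINAL_VOLUME_UL)          # hypothetical 10% stock -> 0.2%
water_ul = FINAL_VOLUME_UL - oligo_total_ul - ssc_ul - sds_ul

print(f"each oligo: {per_oligo_ul:.1f} ul, all oligos: {oligo_total_ul:.1f} ul")
print(f"20X SSC: {ssc_ul:.1f} ul, 10% SDS: {sds_ul:.1f} ul, water: {water_ul:.1f} ul")
```

With different stock concentrations only the SSC, SDS and water volumes change; the 1 µl of each 100 µM oligonucleotide follows from the stated values alone.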

Hybridization reactions were performed using 10 µl of Mixture 1 or Mixture 2 per
microarray. The 10 µl mixture was applied to the microarray under a cover slip measuring 18
mm x 18 mm x 0.2 mm. Hybridizations were performed for 5.5 hours at 42°C in a hybridization
cassette according to the instructions of the manufacturer TeleChem (Sunnyvale, CA).
Following the 5.5 hour hybridization, the microarrays were washed to remove unhybridized
material as follows: twice for 5 minutes in 2X SSC (0.3 M sodium chloride, 0.030 M sodium
citrate) and 0.2% SDS (sodium dodecyl sulfate) at 25°C, and once for 1 minute in 2X SSC (0.3 M
sodium chloride, 0.030 M sodium citrate) at 25°C.
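
The hybridization volumes above imply a small amount of each oligonucleotide per microarray and a thin liquid film under the cover slip. The sketch below computes both; the film thickness assumes the 10 µl spreads evenly under the full 18 mm x 18 mm footprint, which is an idealization.

```python
# Quantities implied by the hybridization conditions described above.
# ASSUMPTION: the liquid spreads evenly under the entire cover slip.

MIXTURE_UL = 10.0              # volume applied per microarray
OLIGO_UM = 2.0                 # concentration of each oligonucleotide
COVER_SLIP_MM = (18.0, 18.0)   # cover slip footprint
BATCH_UL = 50.0                # volume of one prepared mixture

pmol_per_oligo = OLIGO_UM * MIXTURE_UL                 # uM x ul = pmol
area_mm2 = COVER_SLIP_MM[0] * COVER_SLIP_MM[1]
film_thickness_um = MIXTURE_UL / area_mm2 * 1000.0     # 1 ul = 1 mm^3; mm -> um
arrays_per_batch = int(BATCH_UL // MIXTURE_UL)

print(f"{pmol_per_oligo:.0f} pmol of each oligonucleotide per microarray")
print(f"film thickness about {film_thickness_um:.0f} um")
print(f"{arrays_per_batch} microarrays per 50 ul mixture")
```
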

Table 2. Mixtures of synthetic oligonucleotides

Mixture   Oligonucleotide   Sequence
1         ARDC-110          3GACTCCTG(A/T)GGAGAA
1         ARDC-111          5GACTCCTA(A/T)GGAGAA
1         ARDC-112          5TGGTGGTGAGGCCCT
1         ARDC-113          3TGGTGGTAAGGCCCT
1         ARDC-114          3ATCATCTTTGGTGTT
1         ARDC-115          5TATCATCGGTGTTTC
1         ARDC-116          5CACTGCCAGGTAAGG
1         ARDC-117          3CACTGCCGGGTAAGG
1         ARDC-118          3CAACTGGAACCATTG
1         ARDC-119          5CAACTGGGACCATTG
2         ARDC-125          BGACTCCTG(A/T)GGAGAA
2         ARDC-126          BTGGTGGTAAGGCCCT
2         ARDC-127          BATCATCTTTGGTGTT
2         ARDC-128          BCACTGCCGGGTAAGG
2         ARDC-129          BCAACTGGAACCATTG
2         ARDC-135          DGACTCCTA(A/T)GGAGAA
2         ARDC-136          DTGGTGGTGAGGCCCT
2         ARDC-137          DTATCATCGGTGTTTC
2         ARDC-138          DCACTGCCAGGTAAGG
2         ARDC-139          DCAACTGGGACCATTG

All sequences shown are 5' to 3' from left to right. 3 denotes Cy3; 5 denotes Cy5; B
denotes biotin; D denotes dinitrophenol.
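
The Table 2 notation can be decoded mechanically. The sketch below splits an entry into its label and base positions, treating a group such as (A/T) as a position where either base is allowed, and reports where two oligonucleotides of equal length differ. The function names are illustrative only, and just four of the Table 2 entries are included to keep the example short.

```python
import re

# Label codes as defined in the Table 2 footnote.
LABELS = {"3": "Cy3", "5": "Cy5", "B": "biotin", "D": "dinitrophenol"}

# A subset of Mixture 1, copied from Table 2.
MIXTURE_1 = {
    "ARDC-110": "3GACTCCTG(A/T)GGAGAA",
    "ARDC-111": "5GACTCCTA(A/T)GGAGAA",
    "ARDC-112": "5TGGTGGTGAGGCCCT",
    "ARDC-113": "3TGGTGGTAAGGCCCT",
}


def parse_oligo(entry: str):
    """Return (label name, list of allowed-base sets, one set per position)."""
    label = LABELS[entry[0]]
    # Each token is a single base or a degenerate group such as (A/T).
    tokens = re.findall(r"\([ACGT](?:/[ACGT])+\)|[ACGT]", entry[1:])
    positions = [set(token.strip("()").split("/")) for token in tokens]
    return label, positions


def mismatch_positions(entry_a: str, entry_b: str):
    """1-based positions at which two equal-length oligos share no allowed base."""
    _, pos_a = parse_oligo(entry_a)
    _, pos_b = parse_oligo(entry_b)
    if len(pos_a) != len(pos_b):
        raise ValueError("oligonucleotides must have the same length")
    return [i for i, (a, b) in enumerate(zip(pos_a, pos_b), start=1) if not (a & b)]


for first, second in [("ARDC-110", "ARDC-111"), ("ARDC-112", "ARDC-113")]:
    label_a, bases = parse_oligo(MIXTURE_1[first])
    label_b, _ = parse_oligo(MIXTURE_1[second])
    diff = mismatch_positions(MIXTURE_1[first], MIXTURE_1[second])
    print(f"{first} ({label_a}) vs {second} ({label_b}): {len(bases)}-mers, differ at {diff}")
```

Both pairs shown differ at a single position and carry different labels. A position-by-position comparison of this kind is not meaningful for a pair such as ARDC-114 and ARDC-115, whose sequences are offset relative to one another.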



Following the hybridization and wash steps, the microarrays were analyzed for
genotyping information. For the direct labeling experiments involving Mixture 1, the detection
step was performed by scanning the microarray for fluorescence emission immediately following
the wash step. Detection was performed using the ScanArray 3000 confocal scanning instrument
from GSI Lumonics (Billerica, MA) with settings of 100% for the photomultiplier tube (PMT)
and 80% for the laser. The two-color capability of the scanner was used to detect
fluorescent microarray signals in both the Cy3 and Cy5 channels corresponding to hybridization
of the Mixture 1 oligonucleotides. Results are shown in Figs. 3A and 3B, where Fig. 3A
corresponds to detection of fluorescence from Cy3 and Fig. 3B corresponds to detection of
fluorescence from Cy5. The data are presented in a conventional rainbow scale, with red being
the most intense and black being the least intense; the scale bar corresponds to 1.0 mm. A
magnified view of a portion of the microarray in Fig. 3A is shown in Fig. 4A. Microarray
signals appear in triplicate because each amplified neonatal patient sample or control sample was
printed three times at adjacent microarray locations. Sample locations in the microarray are
designated by letters along the y axis (vertical direction) and numbers along the x axis
(horizontal direction). Quantitation of the fluorescent microarray signals was performed using
ScanArray Software from GSI Lumonics (Billerica, MA). Values corresponding to the
microarray signals are plotted in Fig. 5. The data reveal that wild type, heterozygotes and
homozygotes are readily distinguished in all of the examples examined from both the Sickle Cell
and Galactosemia loci. Coefficients of variation (CVs) were <10% for all the triplicate
measurements.
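
A coefficient of variation for a set of triplicate spots is the standard deviation divided by the mean, expressed as a percentage. The sketch below shows one way to compute it. The intensity values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say how the scanner software computed it.

```python
from statistics import mean, stdev


def coefficient_of_variation(values) -> float:
    """CV in percent, using the sample standard deviation (n - 1 denominator)."""
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean signal is zero")
    return stdev(values) / average * 100.0


# Hypothetical intensities for one sample printed at three adjacent locations.
triplicate = [10200.0, 9800.0, 10000.0]
cv = coefficient_of_variation(triplicate)

print(f"CV = {cv:.1f}%  (within the <10% range: {cv < 10.0})")
```
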

Genomic segments from three individuals that differ at β-globin locus
232, for example, are present at microarray locations b1-3, b4-6 and b7-9, respectively. The
three individuals are designated by genotypes of E/E, A/E and A/A, respectively. The E/E
neonate is homozygous for the mutant allele, which is a single nucleotide change from G to A at
position 232 in the beta-globin sequence; the A/E neonate has one mutant allele and one normal
allele at 232 and is thus heterozygous; and the A/A neonate has two normal alleles at 232 (i.e.
both alleles contain a G residue at position 232 in beta-globin). The corresponding synthetic
oligonucleotide in the hybridization mixture (ARDC-113, Table 2) is perfectly complementary to
both alleles of the E/E neonate, perfectly complementary to one allele of the A/E neonate and
contains a one nucleotide mismatch to the other allele in the A/E neonate, and contains a one
nucleotide mismatch to both alleles in the A/A neonate. As expected, the microarray signal
intensities at locations b1-3, b4-6 and b7-9 show a decreasing signal intensity consistent with the
genotypes of the neonatal samples at each of the microarray locations. The remaining samples
gave similar results, which are tabulated in Figure 5.
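
Because Mixture 1 carries a differently labeled oligonucleotide for each allele of a locus, one way to turn the two channel intensities into a genotype is to look at the fraction of the total signal that comes from the mutant-allele probe. The sketch below illustrates that idea only. The text does not describe how calls were made, so the thresholds, the minimum-signal cutoff and the intensity values are all hypothetical.

```python
def call_genotype(mutant_signal: float, normal_signal: float,
                  min_total: float = 500.0, low: float = 0.2, high: float = 0.8) -> str:
    """Classify a spot from its two channel intensities.

    All thresholds are illustrative assumptions, not values from the text.
    """
    total = mutant_signal + normal_signal
    if total < min_total:
        return "no call"                      # too little signal in either channel
    fraction = mutant_signal / total          # share of signal from the mutant-allele probe
    if fraction >= high:
        return "homozygous mutant"
    if fraction <= low:
        return "homozygous normal"
    return "heterozygous"


# Hypothetical (mutant-probe, normal-probe) intensities for three spots.
spots = {"b1": (9000.0, 300.0), "b4": (5000.0, 4800.0), "b7": (250.0, 9500.0)}
for location, (mutant, normal) in spots.items():
    print(location, call_genotype(mutant, normal))
```

In practice the thresholds would need to be set from control samples of known genotype, such as the 24 controls printed alongside the neonatal samples.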

In a second experiment, the indirect labeling approach was demonstrated using Mixture 2
oligonucleotides. The microarrays were hybridized and washed exactly as for the direct labeling
experiments involving Mixture 1 except that the microarrays were stained using a MICROMAX
staining kit according to the instructions of the manufacturer NEN Life Sciences (Boston, MA).
The staining kit uses antibody conjugates that separately recognize the biotin and dinitrophenol
epitopes and use horseradish peroxidase (HRP) to catalyze the deposition of tyramide Cy3 and
tyramide Cy5 onto the microarray surface. Detection was made as for the direct labeling
approach using a ScanArray 3000 by GSI Lumonics (Billerica, MA) in both the Cy3 and Cy5
channels. Similar to the results with the direct labeling approach, the method enabled by indirect
labeling revealed microarray signals from which accurate genotyping information was derived as
illustrated in Figs. 6A, 6B and 4B.

It will be apparent from the foregoing that the present method provides a novel means of
genotyping that is a significant improvement over current methods. The method allows, for the
first time, genotyping of multiple patients and multiple loci in a single assay. A key feature of
the method is that each microarray spot represents a single genetic segment or locus from a
single patient, thereby allowing high specificity between the amplified sequence and the
synthetic oligonucleotide in the hybridization mixture. The capacity to test thousands or tens of
thousands of patients for multiple diseases in a single microarray step makes the method
immediately useful for applications such as neonatal screening, where it represents a significant
savings of time and expense. The method should allow neonatal screening for a cost of less than
ten dollars ($10 U.S.) per disease locus and thus is immediately amenable to widespread
commercial application. The capacity to screen at an early stage for inborn genetic diseases will
have an immediate beneficial impact on human health.

Although the invention has been described with respect to specific examples, the 
description is only an example of the invention's applications and should not be taken as a 
limitation. Various adaptations and combinations of the features of the examples disclosed are 
within the scope of the invention as defined by the following claims. 

CLAIMS

I claim:

1. A method of simultaneously genotyping multiple samples, the method comprising:

amplifying genomic segments from a plurality of samples using polymerase chain
reaction primers, each genomic segment comprising a genetic locus;

forming a microarray on a surface wherein material at each location on the
surface corresponds essentially to a single genomic segment from a single sample;

hybridizing the microarray with a mixture of synthetic oligonucleotides, wherein
the mixture comprises oligonucleotides complementary to the genomic segments; and

deriving genotyping information for multiple samples simultaneously by
detecting signals from the hybridized microarray.

2. The method of Claim 1 wherein the polymerase chain reaction primers comprise a 
plurality of distinct polymerase chain reaction primers such that the genomic segments comprise 
distinct genetic loci and genotyping information is derived simultaneously for multiple genetic 
loci from multiple samples. 

3. The method of Claim 1 wherein the plurality of samples comprises at least 10 
distinct samples. 

4. The method of Claim 3 wherein the plurality of samples comprises at least 5,000 
distinct samples. 

5. The method of Claim 1 wherein the genomic segments comprise human disease loci.

6. The method of Claim 5 wherein the samples are neonatal blood samples.

7. The method of Claim 5 wherein the genetic loci comprise genetic loci associated
with a human gene selected from the group consisting of β-globin, CFTR, and GALT.

8. The method of Claim 1 wherein the density of the microarray on the surface is at
least 1000 spots per square centimeter.

9. The method of Claim 1 wherein the mixture of synthetic oligonucleotides
comprises ten different oligonucleotide sequences.

10. The method of Claim 1 wherein the synthetic oligonucleotides are between about
10 and about 30 nucleotides in length.

11. The method of Claim 1 wherein the genomic segments each comprise between
about 40 and about 1000 base pairs.

12. The method of Claim 1 wherein hybridizing is performed in an aqueous solution
comprising salts and detergent.

13. The method of Claim 1 wherein hybridizing is performed at a temperature about
10 °C below the melting temperature of the synthetic oligonucleotides.

14. The method of Claim 1 wherein the synthetic oligonucleotides comprise
fluorescent labels.

15. The method of Claim 1 wherein the synthetic oligonucleotides comprise
non-fluorescent labels.

16. The method of Claim 1 wherein the genotyping information distinguishes samples
from homozygotes and samples from heterozygotes at a specific genetic locus.

17. The method of Claim 14 wherein the signals are generated by fluorescence
emission from the labeled oligonucleotides.

18. The method of Claim 14 wherein the signals are generated by fluorescence
emission at more than one wavelength of light.

19. The method of Claim 15 wherein the signals are generated by fluorescence
emission after antibody staining.

20. The method of Claim 15 wherein the signals are generated by fluorescence
emission at more than one wavelength of light after antibody staining.

21. The method of Claim 1 wherein the surface comprises glass.

22. The method of Claim 1 wherein the amplified genomic segments comprise amino
linkers.

23. The method of Claim 22 wherein the surface comprises reactive aldehyde groups.

24. The method of Claim 1 wherein the microarray is formed by mechanical
micro-spotting.

ABSTRACT OF THE DISCLOSURE 

A method for genotyping multiple samples at multiple genetic loci in a single assay is 
provided. Microarrays of genomic segments representing discrete loci are formed and 
hybridized with mixtures of synthetic oligonucleotides that are complementary to the genomic 
segments. Genotyping information is derived by reading the microarray signals. The method 
can be used to characterize samples from diverse biological sources and for a variety of 
applications. 

Fig. 4B

Neonatal Patient Sample Number    Allele/Genotype
(missing)                         S/S
(missing)                         A/S
(missing)                         S/C
(missing)                         C/C
5                                 A/C
6                                 A/A
7                                 E/E
8                                 A/E
9                                 A/A
10                                Wild type
11                                Heterozygous
12                                Homozygous
13                                Wild type
14                                Heterozygous
15                                (missing)